-
Notifications
You must be signed in to change notification settings - Fork 805
[SYCL] Do not call urDeviceRetain()
in case of subdevices
#20065
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: sycl
Are you sure you want to change the base?
[SYCL] Do not call urDeviceRetain()
in case of subdevices
#20065
Conversation
b07c1d4
to
51a4b37
Compare
51a4b37
to
ab4b9b9
Compare
ab4b9b9
to
7c033d0
Compare
7c033d0
to
054acf1
Compare
054acf1
to
7eb40c4
Compare
urDeviceRetain()
in case of subdevices
7eb40c4
to
1e745dc
Compare
@pbalcer please review |
1e745dc
to
1ab563c
Compare
sycl/source/detail/device_impl.cpp
Outdated
: get_info_impl<UR_DEVICE_INFO_PARENT_DEVICE>()), | ||
// TODO catch an exception and put it to list of asynchronous exceptions: | ||
MCache{*this} { | ||
if (IsSubDevice) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The early return seems abrupt. It also loses your intention. Why not just say if( !IsSubDevice){ urDeviceRetain }
then it's clear WHY we are tracking isSubDevice
.
Otherwise, someone in the future could easily refactor this code and the retain could end up higher in the function and the issue reintroduced.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@ldorau - please do not "force push". That messes up the continuity of the reviews on github. If you are having to "force push" it probably means you are doing something wrong on your end. |
1ab563c
to
1344429
Compare
OK, so what should I do? Add more commits titled:
? |
urDeviceRetain(MDevice) should not be called in the constructor of device_impl in case of subdevices, because their reference counter does not drop to zero and subdevices are not destroyed nor freed now, what causes memory leaks. It fixes URT-961. Signed-off-by: Lukasz Dorau <[email protected]>
sycl/source/detail/platform_impl.cpp
Outdated
} | ||
|
||
device_impl & | ||
platform_impl::getOrMakeSubDeviceImpl(ur_device_handle_t UrSubDevice) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please implement a helper function that taken an IsSubDevice bool to eliminate duplication.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
b6f2cce
to
119f6a2
Compare
@cperkinsintel OK, a new commit added with fixes |
@intel/llvm-gatekeepers please consider merging |
if (!IsSubDevice) { | ||
// Interoperability Constructor already calls DeviceRetain in | ||
// urDeviceCreateWithNativeHandle. | ||
getAdapter().call<UrApiKind::urDeviceRetain>(MDevice); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The destructor device_impl::~device_impl
calls urDeviceRelease
, which is meant to balance the urDeviceRetain
in the constructor. However, after this change, urDeviceRetain
is no longer called for subdevices in the constructor, while urDeviceRelease
is still invoked for them in the destructor. This imbalance is somewhat concerning. It seems that the actual root cause of the memory leak may lie elsewhere. Could you please clarify what you believe the underlying issue is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason is that when a device is created RefCount
is equal 1 without any call to retain()
, but it is destroyed when RefCount
is equal 0, so there should be one more call to release()
than a number of calls to retain()
... or the implementation of class RefCount
is wrong ?
class RefCount {
public:
RefCount(uint32_t count = 1) : Count(count) {}
...
uint32_t retain() { return ++Count; }
bool release() { return --Count == 0; }
void reset(uint32_t value = 1) { Count = value; }
private:
std::atomic_uint32_t Count;
};
...
ur_result_t urDeviceRelease(ur_device_handle_t Device) {
// Root devices are destroyed during the piTearDown process.
if (Device->isSubDevice()) {
if (Device->RefCount.release()) {
delete Device;
}
}
return UR_RESULT_SUCCESS;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also worth pointing out that only subdevices are really ref counted:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"(only sub-devices being ref counted seems like implementation detail of the adapter)"
That's the reason. I pointed it out because it's possible that the existing sequence of creates/retains/releases was always broken, and we never noticed because the leak happens in the rare case of subdevices. So in a way, I'm agreeing with your earlier comment.
@intel/llvm-gatekeepers please consider merging |
urDeviceRetain(MDevice)
should not be calledin the constructor of
device_impl
in case of subdevices,because their reference counter does not drop to zero
and subdevices are not destroyed nor freed now,
what causes memory leaks.
It fixes URT-961.